The devices, experimental scaffolds, and biomaterials ontology (DEB): a tool for mapping, annotation, and analysis of biomaterials' data
The size and complexity of the biomaterials literature make systematic data analysis an excruciating manual task. A practical solution is creating databases and information resources. Implant design and biomaterials research can greatly benefit from an open database for systematic data retrieval. Ontologies are pivotal to knowledge base creation, serving to represent and organize domain knowledge. To name but two examples, GO, the Gene Ontology, and ChEBI, the Chemical Entities of Biological Interest ontology, and their associated databases are central resources for their respective research communities. The creation of the devices, experimental scaffolds, and biomaterials ontology (DEB), an open resource for organizing information about biomaterials, their design, manufacture, and biological testing, is described. It is developed using text analysis for identifying ontology terms from a biomaterials gold standard corpus, systematically curated to represent the domain's lexicon. Topics covered are validated by members of the biomaterials research community. The ontology may be used for searching terms, performing annotations for machine learning applications, standardized metadata indexing, and other cross-disciplinary data exploitation. The input of the biomaterials community to this effort to create data-driven open-access research tools is encouraged and welcomed. Preprint
Statistical Mechanical Development of a Sparse Bayesian Classifier
The demand for extracting rules from high dimensional real world data is
increasing in various fields. However, the possible redundancy of such data
sometimes makes it difficult to obtain a good generalization ability for novel
samples. To resolve this problem, we provide a scheme that reduces the
effective dimensions of data by pruning redundant components for bicategorical
classification based on the Bayesian framework. First, the potential of the
proposed method is confirmed in ideal situations using the replica method.
Unfortunately, performing the scheme exactly is computationally difficult. So,
we next develop a tractable approximation algorithm, which turns out to offer
nearly optimal performance in ideal cases when the system size is large.
Finally, the efficacy of the developed classifier is experimentally examined
for a real world problem of colon cancer classification, which shows that the
developed method can be practically useful. Comment: 13 pages, 6 figures
Reproducing Kernels of Generalized Sobolev Spaces via a Green Function Approach with Distributional Operators
In this paper we introduce a generalized Sobolev space by defining a
semi-inner product formulated in terms of a vector distributional operator
consisting of finitely or countably many distributional operators,
which are defined on the dual space of the Schwartz space. The types of
operators we consider include not only differential operators, but also more
general distributional operators such as pseudo-differential operators. We
deduce that a certain appropriate full-space Green function associated with
these operators now becomes a conditionally positive definite function. In
order to support this claim we ensure that the distributional adjoint of the
vector distributional operator is well-defined in the distributional sense.
Under sufficient conditions, the
native space (reproducing-kernel Hilbert space) associated with the Green
function can be isometrically embedded into or even be isometrically
equivalent to a generalized Sobolev space. As an application, we take linear
combinations of translates of the Green function with possibly added polynomial
terms and construct a multivariate minimum-norm interpolant to data
values sampled from an unknown generalized Sobolev function at data sites
located in some set. We provide several examples, such
as Matérn kernels or Gaussian kernels, that illustrate how many
reproducing-kernel Hilbert spaces of well-known reproducing kernels are
isometrically equivalent to a generalized Sobolev space. These examples further
illustrate how we can rescale the Sobolev spaces by the vector distributional
operator. Introducing the notion of scale as part of the
definition of a generalized Sobolev space may help us to choose the "best"
kernel function for kernel-based approximation methods. Comment: Updated version of the publication in Num. Math., close to Qi Ye's Ph.D.
thesis (http://mypages.iit.edu/~qye3/PhdThesis-2012-AMS-QiYe-IIT.pdf)
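The interpolation step described in this abstract can be illustrated with a minimal sketch, assuming the simplest case: a Gaussian kernel in one dimension, which is positive definite, so no added polynomial terms are needed. The function names, the `scale` parameter, and the sample data are hypothetical choices for illustration and are not taken from the paper.

```python
import math

def gaussian_kernel(x, y, scale=1.0):
    """Gaussian kernel exp(-(scale * (x - y))^2) for scalar inputs."""
    return math.exp(-(scale * (x - y)) ** 2)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_interpolant(sites, values, scale=1.0):
    """Return s(x) = sum_j c_j K(x, x_j) satisfying s(x_j) = values[j]."""
    gram = [[gaussian_kernel(xi, xj, scale) for xj in sites] for xi in sites]
    coeffs = solve(gram, values)

    def interpolant(x):
        return sum(c * gaussian_kernel(x, xj, scale) for c, xj in zip(coeffs, sites))

    return interpolant

# Hypothetical data: samples of sin(x) at five data sites.
sites = [0.0, 0.5, 1.0, 1.5, 2.0]
values = [math.sin(x) for x in sites]
s = fit_interpolant(sites, values, scale=1.0)
```

Changing `scale` rescales the kernel, which loosely mirrors the abstract's point that the notion of scale affects which kernel is a good choice.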
Angular sensitivity of blowfly photoreceptors: intracellular measurements and wave-optical predictions
The angular sensitivity of blowfly photoreceptors was measured in detail at wavelengths λ = 355, 494 and 588 nm.
The measured curves often showed numerous sidebands, indicating the importance of diffraction by the facet lens.
The shape of the angular sensitivity profile is dependent on wavelength. The main peak of the angular sensitivities at the shorter wavelengths was flattened. This phenomenon as well as the overall shape of the main peak can be quantitatively described by a wave-optical theory using realistic values for the optical parameters of the lens-photoreceptor system.
At a constant response level of 6 mV (almost dark adapted), the visual acuity of the peripheral cells R1-6 is at longer wavelengths mainly diffraction limited, while at shorter wavelengths the visual acuity is limited by the waveguide properties of the rhabdomere.
Closure of the pupil narrows the angular sensitivity profile at the shorter wavelengths. This effect can be fully described by assuming that the intracellular pupil progressively absorbs light from the higher order modes.
In light-adapted cells R1-6 the visual acuity is mainly diffraction limited at all wavelengths.
Generative Models and Model Criticism via Optimized Maximum Mean Discrepancy
We propose a method to optimize the representation and distinguishability of samples from two probability distributions, by maximizing the estimated power of a statistical test based on the maximum mean discrepancy (MMD). This optimized MMD is applied to the setting of unsupervised learning by generative adversarial networks (GAN), in which a model attempts to generate realistic samples, and a discriminator attempts to tell these apart from data samples. In this context, the MMD may be used in two roles: first, as a discriminator, either directly on the samples, or on features of the samples. Second, the MMD can be used to evaluate the performance of a generative model, by testing the model's samples against a reference data set. In the latter role, the optimized MMD is particularly helpful, as it gives an interpretable indication of how the model and data distributions differ, even in cases where individual model samples are not easily distinguished either by eye or by classifier.
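As a rough illustration of the quantity this abstract builds on, the following sketch computes the standard unbiased estimate of the squared MMD between two one-dimensional samples with a fixed-bandwidth Gaussian kernel. It does not implement the paper's optimization of test power; the bandwidth, sample sizes, and distributions are assumptions chosen only for the example.

```python
import math
import random

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel exp(-(x - y)^2 / (2 * bandwidth^2)) for scalar inputs."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of the squared MMD between samples xs and ys."""
    m, n = len(xs), len(ys)
    # Within-sample terms exclude the diagonal (i == j) to keep the estimate unbiased.
    k_xx = sum(gaussian_kernel(a, b, bandwidth)
               for i, a in enumerate(xs) for j, b in enumerate(xs) if i != j)
    k_yy = sum(gaussian_kernel(a, b, bandwidth)
               for i, a in enumerate(ys) for j, b in enumerate(ys) if i != j)
    k_xy = sum(gaussian_kernel(a, b, bandwidth) for a in xs for b in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)

# Hypothetical samples: two draws from N(0, 1) and one from N(2, 1).
rng = random.Random(0)
same_a = [rng.gauss(0.0, 1.0) for _ in range(200)]
same_b = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(2.0, 1.0) for _ in range(200)]

mmd_same = mmd2_unbiased(same_a, same_b)
mmd_shifted = mmd2_unbiased(same_a, shifted)
```

The estimate for the two matching samples should be close to zero (it can be slightly negative), while the estimate against the shifted sample should be clearly positive.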
A Novel Visual Word Co-occurrence Model for Person Re-identification
Person re-identification aims to maintain the identity of an individual in
diverse locations through different non-overlapping camera views. The problem
is fundamentally challenging due to appearance variations resulting from
differing poses, illumination and configurations of camera views. To deal with
these difficulties, we propose a novel visual word co-occurrence model. We
first map each pixel of an image to a visual word using a codebook, which is
learned in an unsupervised manner. The appearance transformation between camera
views is encoded by a co-occurrence matrix of visual word joint distributions
in probe and gallery images. Our appearance model naturally accounts for
spatial similarities and variations caused by pose, illumination and
configuration change across camera views. Linear SVMs are then trained as
classifiers using these co-occurrence descriptors. On the VIPeR and CUHK Campus
benchmark datasets, our method achieves 83.86% and 85.49% at rank-15 on the
Cumulative Match Characteristic (CMC) curves, and beats the state-of-the-art
results by 10.44% and 22.27%. Comment: Accepted at ECCV Workshop on Visual Surveillance and
Re-Identification, 201
Using data mining for wine quality assessment
Certification and quality assessment are crucial issues within the wine
industry. Currently, wine quality is mostly assessed by physicochemical
(e.g. alcohol levels) and sensory (e.g. human expert evaluation) tests. In
this paper, we propose a data mining approach to predict wine preferences
that is based on easily available analytical tests at the certification
step. A large dataset is considered with white vinho verde samples from the
Minho region of Portugal. Wine quality is modeled under a regression
approach, which preserves the order of the grades. Explanatory knowledge is
given in terms of a sensitivity analysis, which measures the response
changes when a given input variable is varied through its domain. Three
regression techniques were applied, under a computationally efficient
procedure that performs simultaneous variable and model selection and that
is guided by the sensitivity analysis. The support vector machine achieved
promising results, outperforming the multiple regression and neural network
methods. Such a model is useful for understanding how physicochemical tests
affect the sensory preferences. Moreover, it can support the wine expert
evaluations and ultimately improve the production.
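The sensitivity analysis mentioned in this abstract, measuring how the response changes as one input is varied through its domain, can be sketched as follows. This is a simplified illustration under stated assumptions: the other inputs are held at a fixed baseline, the response range is used as the sensitivity measure, and `toy_model` is a hypothetical stand-in for a fitted regression model. None of these choices is confirmed by the abstract.

```python
def sensitivity(model, baseline, ranges, levels=5):
    """Return each input's share of the total response range.

    model    -- any callable mapping a list of inputs to a number
    baseline -- input values held fixed while one variable is swept
    ranges   -- (low, high) domain for each input
    levels   -- number of evenly spaced values tried per input
    """
    scores = []
    for i, (low, high) in enumerate(ranges):
        responses = []
        for k in range(levels):
            point = list(baseline)
            point[i] = low + (high - low) * k / (levels - 1)  # sweep input i only
            responses.append(model(point))
        scores.append(max(responses) - min(responses))
    total = sum(scores)
    return [s / total if total else 0.0 for s in scores]

def toy_model(x):
    """Hypothetical fitted model: strong, weak, and no dependence on the three inputs."""
    return 3.0 * x[0] + 1.0 * x[1] + 0.0 * x[2]

importance = sensitivity(toy_model, baseline=[0.5, 0.5, 0.5], ranges=[(0.0, 1.0)] * 3)
```

Because the procedure only calls the model, the same function could be applied to a support vector machine, a neural network, or a multiple regression without modification.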
Elastic Maps and Nets for Approximating Principal Manifolds and Their Application to Microarray Data Visualization
Principal manifolds are defined as lines or surfaces passing through "the
middle" of the data distribution. Linear principal manifolds (Principal Components
Analysis) are routinely used for dimension reduction, noise filtering and data
visualization. Recently, methods for constructing non-linear principal
manifolds were proposed, including our elastic maps approach which is based on
a physical analogy with elastic membranes. We have developed a general
geometric framework for constructing "principal objects" of various
dimensions and topologies with the simplest quadratic form of the smoothness
penalty which allows very effective parallel implementations. Our approach is
implemented in three programming languages (C++, Java and Delphi) with two
graphical user interfaces (VidaExpert
http://bioinfo.curie.fr/projects/vidaexpert and ViMiDa
http://bioinfo-out.curie.fr/projects/vimida applications). In this paper we
overview the method of elastic maps and present in detail one of its major
applications: the visualization of microarray data in bioinformatics. We show
that the method of elastic maps outperforms linear PCA in terms of data
approximation, representation of between-point distance structure, preservation
of local point neighborhood and representing point classes in low-dimensional
spaces. Comment: 35 pages, 10 figures